(54) Indexing of recordings

(57) A recording is indexed by keywords. In order to
perform the indexing, an audio portion (12) of the
recording is transcribed (31) to produce text in a text file.
A time stamp (32) is associated with each word in the
text. Each time stamp (32) indicates the time in the
recording at which the associated word occurs. Once a
recording has been indexed, it may be searched
along with other recordings. For example, in response
to a user choosing a keyword (46), a text file for each
recording is searched for occurrences of the keyword
(46). At the conclusion of the search, each recording
which includes an occurrence of the keyword is listed
(42). When a user selects (42) a first recording and a
particular occurrence of the keyword (46), the first
recording is played starting slightly before a time
corresponding to a first time stamp associated with the
particular occurrence of the keyword in the first recording.
In response to control sequences, prior and next
occurrences of the keyword (42) can be observed in one or
multiple recordings.
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searching of recordings, improving over other currently 
available schemes to index recordings. 

An embodiment of the present invention is de- 
scribed below, by way of example only, with reference 
to the accompanying drawings, in which: 

Figure 1 illustrates steps taken to allow keyword in- 
dexing of digital recordings in accordance with the pre- 
ferred embodiment. 

Figure 2 is a flowchart which shows steps by which 
text for a digital recording is keyword indexed in accord- 
ance with the preferred embodiment. 

Figure 3 and Figure 4 show computing displays 
which illustrate the preparation of a data base used for 
keyword indexing of digital recordings in accordance 
with the preferred embodiment. 

Figure 5 shows a computing display used for key- 
word index searches of a video library in accordance 
with the preferred embodiment. 

Figure 6 shows a computing display used for key- 
word index searches of a video library in accordance 
with an alternate embodiment. 

Figure 1 illustrates steps taken to allow keyword
indexing of digital recordings. A recording source 11 is
digitized and compressed to produce digitized recording
file 13. Recording source 11 is, for example, an audio
recording or an audio-video recording. When recording
source 11 is an audio-video recording, data in digitized
recording file 13 is, for example, stored in MPEG-1
format. Digitized recording file 13 may be produced from
analog recording source 11 using, for example, OptiVideo
MPEG 1 Encoder available from OptiVision, having
a business address of 3450 Hillview Ave., Palo Alto, CA
94304.

In addition, the audio portion of recording source 11
is transcribed to produce a text file 12 which includes
the text. The transcription may be performed manually.
Alternately, the audio portion of recording source 11 may
be transcribed directly from recording source 11 or
digitized recording file 13 using computerized speech
recognition technology such as DragonDictate for Windows
available from Dragon Systems, Inc., having a business
address of 320 Nevada Street, Newton, MA 02160. Text
file 12 and digitized recording file 13 are then made
available to a computer system 14.

Figure 2 is a flowchart which shows steps by which
text for a digitized recording file 13 is keyword indexed.
In a step 31, text of the audio portion of digitized
recording file 13 is produced. This text is a result of the
transcription described above.

Figure 3 illustrates the result of the transcription
process. Figure 3 shows a window 23 in a computer
screen 21. Within window 23 is the transcribed text of
the audio portion of recording file 13.

In a step 32, shown in Figure 2, time stamps
associated with words in the text are added to the transcribed
text. In the preferred embodiment, the time stamps are
in milliseconds and indicate elapse of time relative to the
starting point of the digital recording within recording file
13.

Placement of time stamps may be performed, for
example, with the help of an operator utilizing, on
computer 14 (shown in Figure 1), software specifically
designed to add time stamps. For example, the recording
is played by computer 14. For an audio-video recording,
a window 22 in computer screen 21, as shown in Figure
3, may be added in which the audio-video recording is
played. The operator of computer 14, using cursor 24,
selects words as they are spoken in the recording
played by computer 14. Whenever the operator selects
with cursor 24 a word from the text in window 23, the
software running on computer 14 time stamps the word
with the current time duration which represents the
elapse of time relative to the starting point of the digital
recording.
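
The operator-driven stamping described above can be sketched as follows. This is an illustration only, not the software referred to in the text: the class name TimeStamper, its methods and the injected clock callable are assumptions made for the example, and the "word::ms" rendering simply mirrors the notation used in Table 1.

```python
import time


class TimeStamper:
    """Records the elapsed playback time when an operator selects a word."""

    def __init__(self, words, clock=time.monotonic):
        self.words = list(words)
        self.stamps = {}       # word index -> elapsed milliseconds
        self._clock = clock    # hypothetical clock source returning seconds
        self._start = None

    def start_playback(self):
        # Elapsed time is measured from the starting point of the recording.
        self._start = self._clock()

    def select_word(self, index):
        # Called when the operator selects the word at position `index`.
        if self._start is None:
            raise RuntimeError("playback has not started")
        elapsed_ms = round((self._clock() - self._start) * 1000)
        self.stamps[index] = elapsed_ms
        return elapsed_ms

    def annotated_text(self):
        # Stamped words are rendered as "word::ms"; others are left bare.
        return " ".join(
            f"{w}::{self.stamps[i]}" if i in self.stamps else w
            for i, w in enumerate(self.words)
        )
```

Passing the clock in as a parameter is a design choice for the sketch: a real player would supply its own playback position, which also stays correct if playback is paused.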

Figure 4 further illustrates this process. In Figure 4,
time stamps TS1, TS2 and TS3 have been added to text
23 by an operator as described above. Source code for
software which implements the time stamp feature
discussed above for audio-video recordings will be
apparent to the skilled person. Alternately, step 32, shown in
Figure 2, may be automated so that speech recognition
technology is used to trigger the placement of time
stamps within text 23.

After the time stamps have been added to text 23,
in a step 33 shown in Figure 2, every word of text 23 is
assigned a time stamp. For words which were not
assigned a time stamp in step 32, interpolation is used to
determine an appropriate time stamp.

For example, Table 1 below shows a portion of text 
23 after the completion of step 32. 

Table 1

Once::11 upon a time::20 there was a boy::28
named Fred. He went::35 to the forest::44. . . .

In the example given in Table 1, the word "Once" was
spoken at 11 milliseconds from the beginning of the
audio track of the digital recording. The word "time" was
spoken at 20 milliseconds from the beginning of the
audio track of the digital recording. The word "boy" was
spoken at 28 milliseconds from the beginning of the
audio track of the digital recording. The word "went" was
spoken at 35 milliseconds from the beginning of the
audio track of the digital recording. The word "forest" was
spoken at 44 milliseconds from the beginning of the
audio track of the digital recording.
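
One way to give the unstamped words of Table 1 a time stamp is linear interpolation between the nearest stamped words on either side. The sketch below is illustrative only: the function names are hypothetical, it assumes each stamp is a bare integer following "::", and it rounds fractional milliseconds down, a detail the text does not specify.

```python
def parse_stamped_text(text):
    """Split 'word::ms' tokens into (word, stamp) pairs; stamp is None if absent."""
    pairs = []
    for token in text.split():
        word, sep, stamp = token.partition("::")
        pairs.append((word, int(stamp) if sep else None))
    return pairs


def interpolate_stamps(pairs):
    """Fill missing stamps linearly between neighbouring stamped words."""
    words = [w for w, _ in pairs]
    stamps = [s for _, s in pairs]
    anchors = [i for i, s in enumerate(stamps) if s is not None]
    for left, right in zip(anchors, anchors[1:]):
        span = right - left
        delta = stamps[right] - stamps[left]
        for k in range(left + 1, right):
            # Integer arithmetic: fractional milliseconds are rounded down.
            stamps[k] = stamps[left] + delta * (k - left) // span
    # Assumption: words before the first or after the last stamped word
    # are left as None, since the text does not say how they are handled.
    return list(zip(words, stamps))


text = "Once::11 upon a time::20 there was a boy::28"
result = interpolate_stamps(parse_stamped_text(text))
print(result[:4])  # [('Once', 11), ('upon', 14), ('a', 17), ('time', 20)]
```

With "Once" at 11 and "time" at 20, the nine milliseconds are split into three equal steps, so the two intervening words land at 14 and 17.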

In order to assign time stamps to the remainder of
the words, interpolation is used. For example, nine
milliseconds elapsed between the word "Once" and the
word "time". There are two words, "upon" and "a", which
occur between "Once" and "time". As a result of the
interpolation, the words "upon" and "a" are assigned time
stamps of 14 milliseconds and 17 milliseconds,
respectively. This is done so that there is allocated three
milliseconds between the occurrence of the word "Once"
and the word "upon"; there is allocated three millisec- 
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rence of the keyword, go back to the last occurrence of 
the keyword, continue playing and so on. The interface 
also includes a "cancel" button 49.
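
Stepping between occurrences of a keyword, with playback starting slightly before each one, can be sketched as follows. This is a minimal illustration under stated assumptions: the class name KeywordNavigator, the transcript layout, the case-insensitive matching and the 2000 millisecond lead-in are all hypothetical choices, not details taken from the interface described here.

```python
class KeywordNavigator:
    """Steps through the occurrences of a keyword across several recordings."""

    def __init__(self, transcripts, keyword, preroll_ms=2000):
        # transcripts: {recording name: [(word, time stamp in ms), ...]}
        # preroll_ms is a hypothetical value for "slightly before".
        self.preroll_ms = preroll_ms
        key = keyword.lower()  # assumption: matching ignores case
        self.hits = [
            (name, stamp)
            for name, words in sorted(transcripts.items())
            for word, stamp in words
            if word.lower() == key
        ]
        self.position = 0

    def recordings(self):
        # Recordings containing at least one occurrence of the keyword.
        return sorted({name for name, _ in self.hits})

    def play_point(self):
        # Recording and start time (ms) for the current occurrence.
        if not self.hits:
            raise LookupError("keyword not found in any recording")
        name, stamp = self.hits[self.position]
        return name, max(0, stamp - self.preroll_ms)

    def next(self):
        # Move to the next occurrence, stopping at the last one.
        self.position = min(self.position + 1, max(len(self.hits) - 1, 0))
        return self.play_point()

    def prior(self):
        # Move back to the previous occurrence, stopping at the first one.
        self.position = max(self.position - 1, 0)
        return self.play_point()
```

Occurrences are ordered by recording name and then by position in the transcript, so stepping past the last occurrence in one recording moves on to the first occurrence in the next.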

In addition to searching on one or more keywords
connected by Boolean operators, the balanced tree
formed in step 34 (shown in Figure 2) may also be
searched using concept based searching techniques,
for example using Metamorph available from
Thunderstone Software-EPI, Inc. having a business address of
11115 Edgewater Drive, Cleveland, Ohio 44102.
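
A keyword index supporting Boolean combinations of keywords can be sketched as follows. Python has no balanced tree in its standard library, so this sketch substitutes a sorted key list searched with the bisect module; it is a stand-in chosen for brevity, not the structure formed in step 34. The class and method names are hypothetical, and keywords are assumed to match regardless of case.

```python
import bisect


class KeywordIndex:
    """Sorted keyword index; a stand-in for a balanced tree of keywords."""

    def __init__(self):
        self._keys = []      # keywords, kept in sorted order
        self._postings = []  # _postings[i]: (recording, ms) list for _keys[i]

    def add(self, word, recording, stamp_ms):
        key = word.lower()
        i = bisect.bisect_left(self._keys, key)
        if i == len(self._keys) or self._keys[i] != key:
            self._keys.insert(i, key)
            self._postings.insert(i, [])
        self._postings[i].append((recording, stamp_ms))

    def occurrences(self, word):
        # Binary search: logarithmic lookup, as a balanced tree would give.
        key = word.lower()
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return list(self._postings[i])
        return []

    def recordings_with(self, word):
        return {rec for rec, _ in self.occurrences(word)}

    def search_all(self, *words):
        # Boolean AND: recordings containing every keyword.
        sets = [self.recordings_with(w) for w in words]
        return set.intersection(*sets) if sets else set()

    def search_any(self, *words):
        # Boolean OR: recordings containing at least one keyword.
        return set().union(*(self.recordings_with(w) for w in words))
```

Lookups in the sorted list are logarithmic, but inserting a new keyword shifts list elements and is linear in the number of keywords, which a true balanced tree would avoid.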

The foregoing discussion discloses and describes 
merely exemplary methods and embodiments. 

The disclosures in United States patent application
no. 08/576,106, from which this application claims
priority, and in the abstract accompanying this application
are incorporated herein by reference. The US parent
application also includes examples of the source code
mentioned herein.



Claims 

1. A method of indexing a recording comprising the 
steps of: 

(a) transcribing (31) an audio portion of the re- 
cording to produce text in a text file; and, 

(b) providing (32) for each of a set of words in 
the text, a time stamp which indicates a time in 
the recording at which each word in the set of 
words occurs. 

2. A method as in claim 1 wherein step (a) is
accomplished manually by a transcriber or with the use of
speech recognition technology.

3. A method as in claim 2 wherein when step (a) is 
accomplished with the use of speech recognition 
technology, steps (a) and (b) are performed simul- 
taneously. 

4. A method as in claim 1, 2 or 3, wherein step (b)
includes the substeps of:

(b.1) providing for each of a subset of the set
of words in the text, a time stamp which
indicates a time in the recording at which each word
in the subset of the set of words occurs; and,

(b.2) for a remainder of the set of words which
are not in the subset of the set of words, using
interpolation to provide a time stamp which
indicates a time in the recording at which each
word in the remainder of the set of words
occurs.

5. A method as in claim 4 wherein the recording is an
audio-video recording and wherein substep (b.1)
includes the substeps of:

(b.1.1) displaying the text in a first window (23)
of a computer display;

(b.1.2) playing a video portion of the recording
in a second window (22) of the computer
display; and,

(b.1.3) upon an operator selecting a selected
word of the text in the first window, adding a
time stamp (TS1...) to the text file which
indicates an elapsed time from a beginning of the
recording until selection by the operator of the
selected word.

6. A method as in any preceding claim, comprising the
step of:

(c) arranging the set of words and associated 
time stamps into a balanced tree based on occur- 
rences of each word in the set of words. 

7. A method of accessing selections within a plurality
of recordings, comprising the steps of:

(a) in response to a user choosing a keyword,
searching a plurality of text files for occurrences
of the keyword, wherein text files are associated
with recordings so that for each of the
plurality of recordings, one text file from the plurality
of text files includes a text of an audio portion
of the recording, each word in each text file
being associated with a time stamp (TS1...) which
indicates an approximate location in an associated
recording of an occurrence of the word;

(b) listing (44) recordings which include an
occurrence of the keyword; and,

(c) upon a user selecting a first recording and
a particular occurrence of the keyword, playing
the first recording starting slightly before a time
corresponding to a first time stamp associated
with the particular occurrence of the keyword in
the first recording.

8. A method as in claim 7 wherein in step (c) upon a
user selecting the first recording, a first-in-time
occurrence of the keyword within the first recording is
automatically selected as the particular occurrence
of the keyword.

9. A method as in claim 7 or 8 wherein step (b)
includes the substeps of:

(b.1) listing in a first window the recordings
which include an occurrence of the keyword;

(b.2) highlighting one of the recordings from the
recordings listed in the first window; and,

(b.3) listing each of the occurrences of the
keyword within the recording highlighted in substep
(b.2).

10. A system for accessing selections within a plurality